Talks and Workshops

Machine-Enabled Noun Foraging for OOUX

A talk at OOUX Happy Hour about machine-enabled noun foraging and extraction to set up proto-objects for use in the ORCA process.



Defining the right objects is critical to successful UX.

Noun foraging, the first step of the OOUX process and a valuable activity in its own right, is all about finding those important objects, the things that our users actually care about, so that they can become the anchors of a valuable object-oriented user experience.

To make sure we are confident about our objects, we want to forage from a diverse landscape of resources.

We go foraging in user interview transcripts, competitor websites, our own existing systems, Wikipedia articles, and customer service chat logs.

However, foraging across reams of transcripts, case studies, and SME emails can be a challenge.

This talk at Object-Oriented UX Happy Hour introduces a software-based method to mine diverse content sources.

Using a fun design challenge, we work to quickly reveal relevant nouns, find patterns, and establish a working list of nouns that are grounded in real data.



Resources and tools

Parts of speech tagging software

Part-of-speech tagging, also known as grammatical tagging, is the process of marking up a word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context.

For OOUX, this means automatically finding and tagging nouns in text-based data using computational linguistic algorithms.

Software I’ve used:

  • Parts of Speech Tagger
    • a free, quick, simple website
    • video of where to find the noun download function
  • Sketch Engine
    • 30-day trial, more robust web application
    • can find and fetch text data by:
      • web search, 
      • URL(s) list,
      • website download (up to 10k webpages).
  • ChatGPT from OpenAI

Need a data source to experiment with? The Dementia Guide [PDF] from Alzheimer’s Society is a single, domain-definitive, noun-rich source to play with.

Noun counting framework

A pre-built noun counting Google Sheet to:

  • clean, sanitise and normalise your nouns
  • count and rank frequencies of occurrence
  • aggregate and triangulate nouns from many sources
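The three steps above can be sketched in a few lines of Python using only the standard library. The function names and the crude singularisation rule are my own assumptions, not the spreadsheet's exact formulas; the Google Sheet does the equivalent with formulas and pivot tables.

```python
import re
from collections import Counter

def normalise(noun):
    """Clean and normalise one noun: lowercase, strip punctuation,
    and crudely singularise (drop a trailing 's', but keep 'ss')."""
    n = noun.strip().lower()
    n = re.sub(r"[^a-z\- ]", "", n)
    if n.endswith("s") and not n.endswith("ss"):
        n = n[:-1]
    return n

def count_nouns(nouns):
    """Count and rank normalised nouns by frequency of occurrence."""
    return Counter(normalise(n) for n in nouns if normalise(n))
```

Calling `count_nouns` on lists foraged from several sources, then merging the resulting counters, gives you the aggregated, triangulated ranking the spreadsheet produces.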

Machine-built content inventories

Crawling your own, a competitor’s or a domain’s definitive website can yield noun sources at scale quickly and cheaply. For example, from:

  • URLs (if they are well-designed and well-formed),
  • page titles,
  • headings, especially headings 1 and 2,
  • meta descriptions.
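A crawler like the one mentioned below does this at scale, but the per-page extraction is simple enough to sketch with Python's standard-library HTML parser. The class name and the `found` attribute are my own; this is an illustration of the idea, not the tool's implementation.

```python
from html.parser import HTMLParser

class InventoryParser(HTMLParser):
    """Collect page title, h1/h2 headings, and the meta description
    from one page's HTML -- the noun-rich fields of a content inventory."""

    FIELDS = {"title", "h1", "h2"}

    def __init__(self):
        super().__init__()
        self._current = None
        self.found = []  # list of (field, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag in self.FIELDS:
            self._current = tag
        elif tag == "meta":
            a = dict(attrs)
            if a.get("name") == "description":
                self.found.append(("meta description", a.get("content", "")))

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current and data.strip():
            self.found.append((self._current, data.strip()))
```

Feed each fetched page to an instance with `parser.feed(html)` and append `parser.found` to your noun-counting sheet.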

I use Screaming Frog SEO Spider, a freemium desktop application. You can crawl up to 500 URLs of a site for free, or pay £149 for an annual licence.

Here’s an example content inventory for Crate & Barrel. It took about 15 minutes to complete: 10 minutes of crawling and 5 minutes of tidying up the raw CSV download.
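That tidying step can also be scripted. The sketch below keeps only the inventory columns from a crawl export; the column names are assumed from a typical Screaming Frog CSV and may need adjusting to match your version's headers.

```python
import csv

# Assumed Screaming Frog export column names -- check yours and adjust.
KEEP = ["Address", "Title 1", "H1-1", "Meta Description 1"]

def tidy_inventory(in_path, out_path):
    """Copy a raw crawl CSV, keeping only the content-inventory columns."""
    with open(in_path, newline="", encoding="utf-8") as f_in, \
         open(out_path, "w", newline="", encoding="utf-8") as f_out:
        reader = csv.DictReader(f_in)
        writer = csv.DictWriter(f_out, fieldnames=KEEP)
        writer.writeheader()
        for row in reader:
            writer.writerow({k: row.get(k, "") for k in KEEP})
```

The slimmed-down CSV then drops straight into the noun counting sheet described above.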

Further reading

By Rik Williams

I write about how to collaborate to design simple, usable and inclusive information experiences that make the lives of customers easier. Read more in Categories and Tags.
