A Step-by-Step Guide to Removing Outdated Taxonomies from Your Algolia Search

Hey there! As your application collects more user data and content over time, you may find your Algolia search index getting cluttered with outdated or redundant filters. Taxonomies are a powerful tool for slicing and dicing information in Algolia, but inaccurate or duplicated values can confuse both your users and your developers. So let's walk through how to cleanly delete taxonomies you no longer need.

Why So Many Sites Rely on Algolia for Search

Before we dive into taxonomy management, it's worth discussing why Algolia has become one of the most trusted platforms for search. User expectations for finding relevant information have skyrocketed. According to Google research, over 70% of mobile users expect results within 5 seconds or they will abandon a site.

Fortunately, Algolia is built precisely for the performance that modern applications require. Some impressive stats about their growth:

  • More than 10,500 customers, from Airbnb to Zendesk
  • Peak search rate of over 35,000 queries per second
  • 99.95% uptime and results under 30 milliseconds

No wonder they handle over 5 billion searches a day across customer sites!

At the data volumes many Algolia apps handle, organizing your records with well-maintained taxonomies is critical: it lets you analyze popularity, promote discoveries, and customize display.

When To Consider Cleansing Taxonomies

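So what exactly are taxonomies within your search index? Taxonomies label records according to categories like product type, content tags, app screens, author, ranking, language, etc. In Algolia terms, they are ordinary record attributes that you declare as facets (in the attributesForFaceting setting) so they can be used for filtering.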

You may find reasons to consolidate taxonomies if:

  • Translations or localized content get outdated
  • Category schemes change over time
  • You switch to standardized taxonomies like schema.org
  • Duplicate taxonomy values have slightly different spellings
  • Old groupings or filters are no longer necessary
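
To make the "slightly different spellings" case concrete, here is a minimal sketch that groups values differing only by case, accents, punctuation, or spacing. The function names are hypothetical, and it assumes you already have the taxonomy values as a plain array of strings (for example, copied out of the dashboard).

```javascript
// Reduce a value to a comparison key: no accents, lowercase, and every run
// of punctuation or whitespace collapsed to a single space.
function normalizeValue(value) {
  return value
    .normalize('NFKD')                    // split accented letters into base + mark
    .replace(/[\u0300-\u036f]/g, '')      // drop the combining marks
    .toLowerCase()
    .replace(/[^\p{L}\p{N}]+/gu, ' ')     // punctuation and spacing count as one space
    .trim();
}

// Return groups of distinct spellings that share the same comparison key.
function findNearDuplicates(values) {
  const groups = new Map();
  for (const value of values) {
    const key = normalizeValue(value);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return [...groups.values()]
    .map((group) => [...new Set(group)])  // ignore exact repeats
    .filter((group) => group.length > 1); // keep only real spelling variants
}

// Hand-written sample values, not real index data.
console.log(findNearDuplicates(['Sci-Fi', 'sci fi', 'Jazz', 'jazz ', 'Blues']));
```

Each returned group is a candidate for consolidation into one canonical spelling; a human should still confirm that the variants really mean the same thing.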

In one customer's case, Algolia's CEO shared how they reduced redundant taxonomies from 1200 to just 67! So let's explore how you can programmatically delete records by taxonomy.

Preparing For Taxonomy Deletion

Before hastily cleaning out taxonomy values, let's gather what we need:

Step 1) Log into your Algolia account to collect your Application ID, an API key with write access (such as the Admin API key), and the names of your indices, like default_search.

Step 2) Optionally, I recommend cloning your production index into a temporary test_index. That keeps your changes away from real users until you have verified them.

In code, that looks like:

client.copyIndex('default_search', 'test_index');

Step 3) Review which taxonomy fields exist and the variety of values used. Taxonomies display on the Filtering & Faceting page.
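
If you prefer to review the values in code, one option is to request facet counts with a search such as index.search('', { facets: ['taxonomies.genre'], hitsPerPage: 0 }) and summarize the facets field of the response. The sketch below assumes that response shape (attribute name, then value, then count); the summarizeFacets helper and its rareBelow threshold are hypothetical, and the sample data is hand-written.

```javascript
// `facets` mirrors the `facets` field of a search response:
// { attributeName: { value: count, ... }, ... }
function summarizeFacets(facets, rareBelow = 5) {
  const rows = [];
  for (const [attribute, counts] of Object.entries(facets)) {
    for (const [value, count] of Object.entries(counts)) {
      rows.push({ attribute, value, count, rare: count < rareBelow });
    }
  }
  // Least-used values first: these are the likeliest cleanup candidates.
  return rows.sort((a, b) => a.count - b.count);
}

// Hand-written sample data, not real output from an index.
const sampleFacets = {
  'taxonomies.genre': { Jazz: 3, Rock: 120, jazz: 1 },
};
console.log(summarizeFacets(sampleFacets));
```

Keep in mind that a search response only lists a limited number of values per facet, so a very large taxonomy may need several passes or a full export.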

Now we're ready to safely validate and remove unneeded taxonomy records without disruptions!

Double Checking Taxonomies To Remove

Responsibly deleting data is all about precise targeting. Let's outline how to carefully query and confirm which taxonomy-labeled records would actually be removed:

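// Keep a handle on the temporary index so we can query it.
const index = client.initIndex('test_index');

// Empty query: only the filter decides which records match.
index.search('', {
  filters: 'taxonomies.genre:Jazz'
}).then(({ hits, nbHits }) => {
  console.log(nbHits, hits);
});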

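We initialize our temporary index, then search with an empty query string so that only the filter decides what matches. The filters parameter targets just the records whose taxonomies.genre attribute is Jazz. For this to work, taxonomies.genre must be declared in the index's attributesForFaceting setting.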

Review the returned hits to verify they match expectations before executing permanent deletion!
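
Since the same filter string is used first to verify and later to delete, it can help to build it in one place. Below is a minimal sketch of such a helper; buildFacetFilter is a hypothetical name, not part of the Algolia client. It wraps the value in double quotes so that values containing spaces stay in one piece, and it deliberately refuses values that contain double quotes instead of guessing at escaping rules.

```javascript
// Build a `filters` string of the form attribute:"value".
function buildFacetFilter(attribute, value) {
  if (typeof attribute !== 'string' || attribute.trim() === '') {
    throw new Error('attribute must be a non-empty string');
  }
  if (typeof value !== 'string' || value.trim() === '') {
    // An empty value would make the filter ambiguous, so stop early.
    throw new Error('value must be a non-empty string');
  }
  if (value.includes('"')) {
    // Escaping is left out of this sketch on purpose.
    throw new Error('values containing double quotes are not handled here');
  }
  return `${attribute}:"${value}"`;
}

console.log(buildFacetFilter('taxonomies.genre', 'Hard Bop'));
```

Passing the result to both the verification search and the deletion call removes one common source of mistakes: a typo that makes the two filters differ.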

Deleting Taxonomies by Query Filter

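Once satisfied with the intended records, we use Algolia's delete-by-filter operation. Older API clients exposed it as deleteByQuery, which has since been deprecated; in recent versions of the JavaScript client the method is deleteBy:

index.deleteBy({
  filters: 'taxonomies.genre:Jazz'
});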

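And voila! Every record matching that filter is removed. Keep in mind that the call is asynchronous: Algolia queues the deletion as a task, so the records disappear once that task completes rather than the instant the call returns. Also note that this deletes the whole record, not just its taxonomy value.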

Unlike a search, this operation is driven by filters rather than by a text query, so the taxonomy filter alone defines what gets deleted. This is often simpler than attempting to assemble complex search queries.
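
To tie the verification and deletion steps together, here is a hedged sketch of a guard that counts the matches first and refuses to delete when the number is larger than expected. It assumes a v4-style JavaScript client in which the index object exposes search(query, params) and a delete-by-filter method named deleteBy whose result has a wait() method; deleteByFilterSafely and maxExpected are hypothetical names.

```javascript
// `index` is assumed to be a v4-style index object (from client.initIndex).
async function deleteByFilterSafely(index, filters, maxExpected) {
  if (typeof filters !== 'string' || filters.trim() === '') {
    throw new Error('refusing to delete without a filter');
  }
  // Dry run: ask only for the match count, not for the hits themselves.
  const { nbHits } = await index.search('', { filters, hitsPerPage: 0 });
  if (nbHits > maxExpected) {
    throw new Error(`filter matches ${nbHits} records, expected at most ${maxExpected}`);
  }
  if (nbHits === 0) {
    return { deleted: false, matched: 0 };
  }
  // Queue the deletion, then wait for the indexing task to finish.
  await index.deleteBy({ filters }).wait();
  return { deleted: true, matched: nbHits };
}
```

A call might look like deleteByFilterSafely(index, 'taxonomies.genre:Jazz', 50). The count is a sanity check rather than a guarantee, since records can change between the search and the deletion.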

How Other Developers Delete Records

In addition to deleteByQuery, Algolia provides other deletion approaches:

  • deleteObject – Pass in a single record's objectID
  • deleteObjects – Supply an array of objectIDs to bulk delete
  • clearObjects – Wipe an entire index by removing all objects

Each approach has tradeoffs:

Method               Scope                    Speed      Requires
Delete by Query      Broad, Filtered Batch    Slowest    Filter Logic
Delete by ID         Individual Record        Fast       Record ID
Batch Delete by ID   Batch Records            Medium     Record IDs
Clear All            Entire Index             Fastest    Dangerous!

So in summary:

  • Delete by query when you need to prune many records filtered a certain way
  • Delete by ID for precise control over individual records
  • Clear objects as an absolute last resort since you lose everything!
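
When a filter alone is too coarse, one way to get the precise control of deleting by ID is to review the hits in code and collect only the objectIDs you really want gone. The sketch below is a plain helper with a hypothetical name (collectObjectIDs); the archived field in the sample hits is invented for illustration.

```javascript
// Return the objectIDs of the hits that the predicate marks for deletion.
function collectObjectIDs(hits, shouldDelete) {
  const ids = hits.filter(shouldDelete).map((hit) => hit.objectID);
  return [...new Set(ids)]; // guard against the same record appearing twice
}

// Hand-written sample hits, not real search output.
const sampleHits = [
  { objectID: 'a1', taxonomies: { genre: 'Jazz' }, archived: true },
  { objectID: 'a2', taxonomies: { genre: 'Jazz' }, archived: false },
  { objectID: 'a3', taxonomies: { genre: 'Rock' }, archived: true },
];

const idsToDelete = collectObjectIDs(
  sampleHits,
  (hit) => hit.taxonomies.genre === 'Jazz' && hit.archived
);
console.log(idsToDelete);
```

The resulting array is the kind of input deleteObjects expects, so the review logic stays separate from the call that actually removes data.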

Now that you know the options, what's the next step for your Algolia instance?

Additional Best Practices In Managing Taxonomies

Since taxonomies help organize relevant connections in your data, it's important to model them accurately and keep them aligned with your application's domain.

Consider if taxonomy terms could be standardized using conventions like Schema.org for broader interoperability.
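
Standardizing terms usually means rewriting values rather than deleting records. As a rough sketch, the helper below plans those rewrites from a mapping of legacy spellings to canonical terms. The mapping, the planGenreUpdates name, and the taxonomies.genre layout are assumptions for illustration; the returned objects are shaped so they could be handed to a partial update call such as partialUpdateObjects.

```javascript
// Hypothetical mapping from legacy spellings to the canonical term.
const canonicalGenres = new Map([
  ['jazz', 'Jazz'],
  ['Jaz', 'Jazz'],
  ['sci fi', 'Science Fiction'],
]);

// Build one update payload per record whose genre needs to change.
function planGenreUpdates(records, mapping) {
  const updates = [];
  for (const record of records) {
    const current = record.taxonomies && record.taxonomies.genre;
    const canonical = mapping.get(current);
    if (canonical !== undefined && canonical !== current) {
      updates.push({
        objectID: record.objectID,
        // Send the whole nested object so sibling keys are preserved; this
        // sketch assumes an update replaces a top-level attribute as a unit.
        taxonomies: { ...record.taxonomies, genre: canonical },
      });
    }
  }
  return updates;
}

// Hand-written sample records, not real index data.
const sampleRecords = [
  { objectID: '1', taxonomies: { genre: 'jazz', language: 'en' } },
  { objectID: '2', taxonomies: { genre: 'Jazz' } },
  { objectID: '3' },
];
console.log(planGenreUpdates(sampleRecords, canonicalGenres));
```

Planning the updates as data first makes it easy to review them, or run them against a cloned index, before anything is written.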

I also recommend documenting your taxonomies and every change you make to them. During a transition, Algolia's Rules can help as well: they adjust how queries are handled at search time without modifying the records themselves.

And before removing historical taxonomies, evaluate if they have business value for analytics. Analyzing popularity changes in tags and categories over time can uncover hidden insights!

Wrap Up

I hope walking through how to delete taxonomies by query helps you simplify the outdated metadata and filters cluttering your Algolia search results! Adapt these snippets as needed to consolidate stale groupings.

And consider cloning production indices so you can test changes safely, away from real users. Algolia presents many options to keep your search relevance sharp.

Let me know if you have any other questions on effective search data modeling! I'm always happy to help explain best practices.
