Percona Server for MongoDB 6.0

    Additional text search algorithm - ngram

    The ngram text search algorithm is useful for finding a specific string of characters in a field of a collection. It finds exact substring matches, which makes it an alternative for text in languages other than the European languages that the MongoDB Community full text search engine already supports. It can also be more convenient for text in which symbols such as the dash (-), underscore (_), or slash (/) are not token delimiters.

    Unlike the MongoDB full text search engine, the ngram search algorithm uses only the following token delimiter characters, none of which count as word characters in human languages:

    • Horizontal tab
    • Vertical tab
    • Line feed
    • Carriage return
    • Space
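As a rough illustration of the delimiter set listed above, the following JavaScript sketch splits a string on only those five characters. The function name `splitOnNgramDelimiters` is hypothetical, and the code is not the server's tokenizer. It only shows that characters such as the dash, underscore, and slash stay inside a token.

```javascript
// Illustrative only: mimics the delimiter set listed above,
// not the actual tokenizer inside the server.
// Delimiters: horizontal tab, vertical tab, line feed, carriage return, space.
const NGRAM_DELIMITERS = /[\t\v\n\r ]+/;

// Hypothetical helper: split text on the ngram delimiter characters only.
function splitOnNgramDelimiters(text) {
  // Drop empty strings produced by leading or trailing delimiters.
  return text.split(NGRAM_DELIMITERS).filter((token) => token.length > 0);
}

// Dash, slash, and underscore do not split tokens here.
console.log(splitOnNgramDelimiters("backup-2021-02-12 /var/log\tmy_file"));
```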

    The ngram text search is slower than MongoDB full text search.

    Usage

    To use ngram, create a text index on a collection setting the default_language parameter to ngram:

    > db.collection.createIndex({name:"text"}, {default_language: "ngram"})
    

    The ngram search algorithm treats special characters as individual terms. Therefore, you don't have to enclose the search string in escaped double quotes (\") to query the text index. For example, to search for documents that contain the date 2021-02-12, run the following:

    > db.collection.find({ $text: { $search: "2021-02-12" } })
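The two commands above can be combined into one short script for the mongo shell. This is a minimal sketch under stated assumptions: the collection name `ngram_demo`, the sample documents, and the function name `runNgramDemo` are hypothetical, and the script assumes a shell that is connected to a Percona Server for MongoDB instance and provides the usual `db` object.

```javascript
// Minimal sketch: create an ngram text index and query it.
// Assumes it runs in a mongo shell where `db` is the current database.
function runNgramDemo(database) {
  // Hypothetical collection name, used only for this example.
  const coll = database.getCollection("ngram_demo");
  coll.drop(); // start from an empty collection

  // Hypothetical sample documents.
  coll.insertMany([
    { name: "report 2021-02-12 final" },
    { name: "report 2021-03-01 draft" },
  ]);

  // Text index that uses the ngram algorithm, as described above.
  coll.createIndex({ name: "text" }, { default_language: "ngram" });

  // No escaped double quotes are needed around the date.
  return coll.find({ $text: { $search: "2021-02-12" } }).toArray();
}

// Run only when a shell-provided `db` object is available.
if (typeof db !== "undefined") {
  printjson(runNgramDemo(db));
}
```

Dropping the collection first keeps the sketch repeatable. Do not point it at a collection that holds real data.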
    

    However, both ngram and the MongoDB full text search engine treat a word preceded by a hyphen-minus (-) as negated (for example, "-coffee") and exclude documents that contain such words from the search results.
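The difference between a hyphen inside a term and a hyphen in front of a term can be sketched in JavaScript as follows. This is a simplified model for illustration, not the server's query parser. The function name `splitSearchTerms` is hypothetical, and edge cases such as a lone hyphen are handled here by assumption only.

```javascript
// Illustrative only: a simplified model of negation in a search string,
// not the actual query parser inside the server.
function splitSearchTerms(search) {
  const included = [];
  const negated = [];
  // Split on the ngram delimiter characters only.
  for (const token of search.split(/[\t\v\n\r ]+/)) {
    if (token.length === 0) continue;
    if (token.startsWith("-") && token.length > 1) {
      // A leading hyphen-minus marks the word as negated.
      negated.push(token.slice(1));
    } else {
      // Hyphens inside a term, as in a date, do not negate it.
      included.push(token);
    }
  }
  return { included, negated };
}

console.log(splitSearchTerms("2021-02-12 -coffee"));
```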


    Last update: December 7, 2022
    Created: December 7, 2022
    Percona LLC and/or its affiliates, © 2023