LDAP: An index with a LIMIT-EXCEEDED condition causing an unindexed search


In order to complete requests more quickly, the directory server can maintain indexes for values of attributes. The value of the attribute is used as the key, and a list of entry IDs is the data. The entry IDs can be considered pointers to the entries that have that value for the attribute in question.

Unindexed searches cannot always be avoided or eradicated, but important searches by applications should be indexed wherever it is feasible to do so. Unindexed searches can take a long time, and can adversely impact other LDAP clients connected to the server. Note that a search starting high up in the directory information tree with subtree scope and filter “(objectClass=*)” or “(objectClass=top)” will almost always be unindexed, depending on where the base object is located in the tree, since every entry has an objectClass, and every entry has the objectClass “top”.

Concepts

  • indexes
  • dbtest list-all
  • dbtest dump-database-container
  • cn=monitor
  • unindexed search
  • inspecting indexes
  • debugsearchindex
  • index-entry-limit
  • verify-index

Goals

  • understand what indexes the UnboundID Directory Server utilizes
  • understand how indexes work in the UnboundID Directory Server
  • understand how a LIMIT-EXCEEDED condition can result in an unindexed search
  • understand how to correct an index with the LIMIT-EXCEEDED condition for an attribute value

Indexes

In order to complete requests more quickly, the directory server can maintain indexes for values of attributes. The directory server uses the following types of indexes:

  • system indexes
  • local DB indexes
  • local VLV indexes
  • filtered indexes

Attributes can be indexed for:

NAME: Approximate Match
DESCRIPTION: An approxMatch filter is TRUE when there is a value of the attribute type or subtype for which some locally-defined approximate matching algorithm (e.g., spelling variations, phonetic match, etc.) returns TRUE. If a value matches for equality, it also satisfies an approximate match. If approximate matching is not supported for the attribute, this filter item should be treated as an equalityMatch.
EXAMPLE FILTER: (cn~=gardner)

NAME: Equality Match
DESCRIPTION: The matching rule for an equalityMatch filter is defined by the EQUALITY matching rule for the attribute type or subtype. The filter is TRUE when the EQUALITY rule returns TRUE as applied to the attribute or subtype and the asserted value.
EXAMPLE FILTER: (cn=gardner)

NAME: Substring
DESCRIPTION: There SHALL be at most one ‘initial’ and at most one ‘final’ in the ‘substrings’ of a SubstringFilter. If ‘initial’ is present, it SHALL be the first element of ‘substrings’. If ‘final’ is present, it SHALL be the last element of ‘substrings’. The matching rule for an AssertionValue in a substrings filter item is defined by the SUBSTR matching rule for the attribute type or subtype. The filter is TRUE when the SUBSTR rule returns TRUE as applied to the attribute or subtype and the asserted value. Note that the AssertionValue in a substrings filter item conforms to the assertion syntax of the EQUALITY matching rule for the attribute type rather than to the assertion syntax of the SUBSTR matching rule for the attribute type. Conceptually, the entire SubstringFilter is converted into an assertion value of the substrings matching rule prior to applying the rule.
EXAMPLE FILTER: (cn=gard*)

NAME: Ordering
DESCRIPTION: Used for lexicographic (relative value) matching in greater-or-equal (GE) and less-or-equal (LE) filters.
EXAMPLE FILTER: (cn>=gardner), (cn<=gardner)

NAME: Presence
DESCRIPTION: A present filter is TRUE when there is an attribute or subtype of the specified attribute description present in an entry, FALSE when no attribute or subtype of the specified attribute description is present in an entry, and Undefined otherwise.
EXAMPLE FILTER: (cn=*)
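
To make the relationship between filter syntax and index type concrete, here is a minimal Python sketch that classifies a single filter item by the kind of index it would need. It is an illustration only: the function name index_type_for and the regular expression are assumptions made for this example, it handles only simple one-item filters (no AND, OR, NOT, or extensible match), and it is not how the directory server parses filters.

```python
import re

# Hypothetical helper for illustration; handles one simple filter item only.
SIMPLE_FILTER = re.compile(
    r"^\((?P<attr>[A-Za-z][A-Za-z0-9;.-]*)(?P<op>~=|>=|<=|=)(?P<value>.*)\)$"
)

def index_type_for(filter_text):
    """Return the index type a simple LDAP filter item would rely on."""
    match = SIMPLE_FILTER.match(filter_text)
    if match is None:
        raise ValueError("not a simple filter item: %r" % filter_text)
    op, value = match.group("op"), match.group("value")
    if op == "~=":
        return "approximate"
    if op in (">=", "<="):
        return "ordering"
    if value == "*":
        return "presence"
    if "*" in value:
        return "substring"
    return "equality"

EXAMPLES = ["(cn~=gardner)", "(cn=gardner)", "(cn=gard*)",
            "(cn>=gardner)", "(cn<=gardner)", "(cn=*)"]

for example in EXAMPLES:
    print(example, "->", index_type_for(example))
```

Compound filters would need a real filter parser; the sketch only shows which operator corresponds to which index type.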

Format of Index Databases

The directory server indexes are stored in databases in the backend database environment. Each index database contains key-value pairs: the key is a value of the attribute as found in the database, and the value is a list of entry identifiers, each of which resolves to the distinguished name of an entry that contains that attribute value. This diagram shows the relationship for the value ‘atp’ of the attribute ‘sn’:

[Figure: Indexes]

The diagram shows that the attribute ‘sn’ has a value ‘atp’, which occurs in the listed entry identifiers, and traces the path for entry identifier 4. The value of the attribute is used as the key, and a list of entry IDs is the data. The entry IDs can be considered pointers to the entries that have that value for the attribute in question.
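
The key-to-ID-list layout can be pictured with a small Python sketch. It is a conceptual model only, not the server's storage format: the entry IDs, DNs, and the helper names build_equality_index and lookup are made up for illustration, and lowercasing stands in for whatever equality matching rule the attribute really uses.

```python
from collections import defaultdict

# Hypothetical entries: entry ID -> (DN, attributes). Not taken from a real server.
ENTRIES = {
    1: ("uid=user.1,ou=People,dc=example,dc=com", {"sn": ["Atp"]}),
    4: ("uid=user.4,ou=People,dc=example,dc=com", {"sn": ["atp"]}),
    7: ("uid=user.7,ou=People,dc=example,dc=com", {"sn": ["Abel"]}),
}

def build_equality_index(entries, attribute):
    """Map each normalized attribute value (the key) to a list of entry IDs."""
    index = defaultdict(list)
    for entry_id in sorted(entries):
        _dn, attrs = entries[entry_id]
        for value in attrs.get(attribute, []):
            key = value.lower()  # assumption: case-insensitive equality matching
            if entry_id not in index[key]:
                index[key].append(entry_id)
    return dict(index)

def lookup(index, entries, value):
    """Resolve a value to DNs by following the entry IDs stored under its key."""
    return [entries[entry_id][0] for entry_id in index.get(value.lower(), [])]

sn_index = build_equality_index(ENTRIES, "sn")
print(sn_index)
print(lookup(sn_index, ENTRIES, "atp"))
```

A search for (sn=atp) only has to read one key and follow its IDs, which is why an indexed search avoids examining every entry.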

When a search takes longer than expected or the search is flagged as unindexed in the access log, start looking at indexes. When the number of entries that contain a given value of an attribute exceeds the entry limit for that index, a LIMIT-EXCEEDED condition exists for that value, and no pointers to entries are maintained for it. The UnboundID Directory Server makes the task of refactoring indexes very easy by providing a tool, dbtest, which takes ‘dump-database-container’ as a subcommand. The output of ‘dbtest dump-database-container’ tells the administrator exactly which value is causing the LIMIT-EXCEEDED condition.

verify-index

The example below verifies the sn index:

$ bin/verify-index --baseDn 'dc=example,dc=com' --clean --index sn
[16:05:26]  The console logging output is also available in '/Users/terrygardner/servers/ds/1389/logs/tools/verify-index.log'
[16:05:37]  Processed 45611 out of 100003 records and found 0 error(s) (recent rate 4557.9/sec)
[16:05:37]  Free memory = 674 MB, Cache miss rate = 3.2/record
[16:05:38]  Checked 55127 records and found 0 error(s) in 11 seconds (average rate 4805.4/sec)
[16:05:38]  Number of records referencing more than one entry: 55098
[16:05:38]  Number of records that exceed the entry limit: 14
[16:05:38]  Average number of entries referenced is 12.56/record
[16:05:38]  Maximum number of entries referenced by any record is 3962
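
When verify-index is run regularly, the summary lines can be pulled out of the log with a few regular expressions. The Python sketch below is one way to do that, assuming the log lines keep the wording shown above; the function name summarize_verify_index and the dictionary keys are invented for this example.

```python
import re

# Sample lines copied from the verify-index run shown above.
VERIFY_OUTPUT = """\
[16:05:38]  Checked 55127 records and found 0 error(s) in 11 seconds (average rate 4805.4/sec)
[16:05:38]  Number of records referencing more than one entry: 55098
[16:05:38]  Number of records that exceed the entry limit: 14
[16:05:38]  Average number of entries referenced is 12.56/record
[16:05:38]  Maximum number of entries referenced by any record is 3962
"""

def summarize_verify_index(text):
    """Extract the integer counters from verify-index summary lines."""
    patterns = {
        "checked": r"Checked (\d+) records",
        "errors": r"found (\d+) error\(s\) in",
        "over_limit": r"records that exceed the entry limit: (\d+)",
        "max_entries": r"referenced by any record is (\d+)",
    }
    summary = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        if match:
            summary[name] = int(match.group(1))
    return summary

summary = summarize_verify_index(VERIFY_OUTPUT)
print(summary)
if summary.get("over_limit", 0) > 0:
    print("some index keys exceed the entry limit")
```

A non-zero over_limit value is the cue to find out which keys are affected.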

ds-index-exceeded-entry-limit-count-since-db-open

The attribute ds-index-exceeded-entry-limit-count-since-db-open counts the number of times the index entry limit has been exceeded since the database was opened. The example below retrieves the ds-index-exceeded-entry-limit-count-since-db-open attribute from the index monitor entries under cn=monitor:

$ ldapsearch --nopropertiesFile  -h localhost -p 1389 \
  -D cn=rootdn -j ~/.pwdFile -b cn=monitor -s one \
   '(&(objectClass=ds-index-monitor-entry)(ds-index-exceeded-entry-limit-count-since-db-open>=1))' \
   ds-index-exceeded-entry-limit-count-since-db-open
dn: cn=Index dc_example_dc_com_objectClass.equality,cn=monitor
ds-index-exceeded-entry-limit-count-since-db-open: 2

dn: cn=Index dc_example_dc_com_sn.equality,cn=monitor
ds-index-exceeded-entry-limit-count-since-db-open: 1

dn: cn=Index dc_example_dc_com_sn.substring,cn=monitor
ds-index-exceeded-entry-limit-count-since-db-open: 23
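
To rank the indexes by how often the limit was exceeded, the LDIF returned by the search can be post-processed. The following Python sketch is one possible approach; it assumes simple LDIF with no folded or base64-encoded lines, and the names exceeded_counts and worst_first are made up for this example.

```python
# Sample LDIF copied from the cn=monitor search shown above.
MONITOR_LDIF = """\
dn: cn=Index dc_example_dc_com_objectClass.equality,cn=monitor
ds-index-exceeded-entry-limit-count-since-db-open: 2

dn: cn=Index dc_example_dc_com_sn.equality,cn=monitor
ds-index-exceeded-entry-limit-count-since-db-open: 1

dn: cn=Index dc_example_dc_com_sn.substring,cn=monitor
ds-index-exceeded-entry-limit-count-since-db-open: 23
"""

ATTR = "ds-index-exceeded-entry-limit-count-since-db-open"

def exceeded_counts(ldif_text):
    """Map each monitor entry DN to its exceeded-limit counter."""
    counts = {}
    current_dn = None
    for line in ldif_text.splitlines():
        if line.startswith("dn: "):
            current_dn = line[len("dn: "):]
        elif line.startswith(ATTR + ": ") and current_dn is not None:
            counts[current_dn] = int(line.split(": ", 1)[1])
    return counts

def worst_first(counts):
    """Sort (dn, count) pairs so the most frequently exceeded index comes first."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)

for dn, count in worst_first(exceeded_counts(MONITOR_LDIF)):
    print(count, dn)
```

In this sample the sn.substring index stands out, which is typical because short substrings are shared by many values.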

Example Index with a LIMIT-EXCEEDED Condition

The illustration below depicts a portion of the equality index for the sn attribute. For example, the value “abedi” appears as the value of the sn attribute in 11 entries with IDs 6, 10, 13, 16, 27, 37, 45, 78, 82, 92, and 97. The value “abel” appears in one entry whose ID is 11. The value “abe” appears in a number of entries where that number exceeds the “index-entry-limit” for the sn attribute and is marked accordingly with the LIMIT-EXCEEDED notation. Searches that utilize an index value that has the LIMIT-EXCEEDED condition will be unindexed searches, since the IDs are not present.
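
The behaviour can be modelled with a toy index in Python. This is a sketch under stated assumptions, not the server's implementation: the class LimitedIndex, the LIMIT_EXCEEDED marker, the limit of 100, and the IDs used for “abe” are all hypothetical, and the model simply assumes that a key stops tracking IDs once its count goes past the limit.

```python
LIMIT_EXCEEDED = "LIMIT-EXCEEDED"  # marker for this sketch only

class LimitedIndex:
    """Toy equality index that stops tracking IDs for a key past entry_limit."""

    def __init__(self, entry_limit):
        self.entry_limit = entry_limit
        self.keys = {}  # key -> list of entry IDs, or LIMIT_EXCEEDED

    def add(self, key, entry_id):
        ids = self.keys.setdefault(key, [])
        if ids == LIMIT_EXCEEDED:
            return  # the key no longer maintains pointers to entries
        ids.append(entry_id)
        if len(ids) > self.entry_limit:
            self.keys[key] = LIMIT_EXCEEDED

    def candidates(self, key):
        """Return candidate IDs, or None when the search would be unindexed."""
        ids = self.keys.get(key, [])
        return None if ids == LIMIT_EXCEEDED else list(ids)

sn_equality = LimitedIndex(entry_limit=100)  # arbitrary small limit for the sketch
for entry_id in (6, 10, 13, 16, 27, 37, 45, 78, 82, 92, 97):
    sn_equality.add("abedi", entry_id)
sn_equality.add("abel", 11)
for entry_id in range(1000, 1150):  # hypothetical IDs: 150 entries share "abe"
    sn_equality.add("abe", entry_id)

for key in ("abedi", "abel", "abe"):
    ids = sn_equality.candidates(key)
    print(key, "->", "unindexed" if ids is None else ids)
```

Once a key is marked, adding or removing entries does not bring the list back in this model, which mirrors the need to rebuild the index after raising the limit.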

Unindexed searches cannot always be avoided or eradicated, but important searches by applications should be indexed wherever it’s feasible to do so. Unindexed searches can take a long time, and can adversely impact other LDAP clients connected to the server. Note that a search starting high up in the directory information tree with subtree scope and filter “(objectClass=*)” or “(objectClass=top)” will almost always be unindexed, depending on where the base object is located in the tree, since every entry has an objectClass, and every entry has an objectClass attribute with the value “top”.

The maximum number of IDs stored per value is capped by the “index-entry-limit” property. As shipped, the directory server configuration uses a default index-entry-limit of 4000; this default can be modified, and the “index-entry-limit” may also be set for each attribute index individually. Only attributes that appear in the schema can be indexed. Virtual attributes cannot be indexed, even though they do appear in the schema.

[Figure: Indexes full]

To display the default index-entry-limit for a backend, execute the following command, replacing the command line arguments with your local values:

$ dsconfig -h localhost -p 1389 -D cn=rootdn -j ~/.pwdFile --useStartTls --trustAll --no-prompt get-backend-prop --backend-name userRoot --property index-entry-limit
Property          : Value(s)
------------------:---------
index-entry-limit : 4000

To display the index-entry-limit for the sn index, execute the following command, replacing the command line arguments with your local values:

$ dsconfig -h localhost -p 1389 -D cn=rootdn -j ~/.pwdFile --useStartTls --trustAll --no-prompt get-local-db-index-prop --property index-entry-limit --backend-name userRoot --index-name sn
Property          : Value(s)
------------------:---------
index-entry-limit : 100
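
The two values interact in a simple way, which the Python sketch below tries to capture. It assumes that a limit set on an individual index takes precedence over the backend default, which is consistent with the output above; the function names and the unconfigured “cn” index are hypothetical.

```python
BACKEND_DEFAULT_LIMIT = 4000    # backend value reported above
PER_INDEX_LIMITS = {"sn": 100}  # sn index value reported above

def effective_limit(index_name, per_index=PER_INDEX_LIMITS,
                    default=BACKEND_DEFAULT_LIMIT):
    """Assumption: a per-index limit, when set, overrides the backend default."""
    return per_index.get(index_name, default)

def exceeds_limit(index_name, matching_entries):
    """True when a value shared by this many entries would be LIMIT-EXCEEDED."""
    return matching_entries > effective_limit(index_name)

print(effective_limit("sn"))      # limit configured on the sn index
print(effective_limit("cn"))      # hypothetical index with no override
print(exceeds_limit("sn", 2000))  # 2000 entries against a limit of 100
```

With the sn limit at 100, any surname shared by more than 100 entries falls out of the index even though the backend default would have accommodated it.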

Correcting the sn index

debugsearchindex

The ‘debugsearchindex’ attribute reports how the server uses its indexes to process a search. Request it as the attribute to return in a search:

$ ldapsearch -h localhost -p 1389 -b 'dc=example,dc=com' -s sub -D cn=rootdn -j ~/.pwdFile '(sn=abe)' debugsearchindex

Arguments from tool properties file:  --useStartTLS true --trustAll true

dn: cn=debugsearch
debugsearchindex: filter=(sn=abe)[INDEX:sn.equality][LIMIT-EXCEEDED] scope=whole
 Subtree[COUNT:2006] final=[COUNT:2006]

This output indicates that the entry limit was exceeded for the value ‘abe’ in the sn.equality index, so no ID list is available for that key.
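
The same check can be scripted. The Python sketch below unfolds the LDIF continuation line and reports which indexes are tagged LIMIT-EXCEEDED; it assumes the debugsearchindex value keeps the bracketed format shown above, and the function names are invented for this example.

```python
import re

# Sample LDIF copied from the debugsearchindex search shown above.
DEBUG_LDIF = """\
dn: cn=debugsearch
debugsearchindex: filter=(sn=abe)[INDEX:sn.equality][LIMIT-EXCEEDED] scope=whole
 Subtree[COUNT:2006] final=[COUNT:2006]
"""

def unfold(ldif_text):
    """Join LDIF continuation lines (lines starting with one space)."""
    return re.sub(r"\n ", "", ldif_text)

def limit_exceeded_indexes(ldif_text):
    """Return the index names immediately followed by a LIMIT-EXCEEDED tag."""
    value = unfold(ldif_text)
    return re.findall(r"\[INDEX:([^\]]+)\]\[LIMIT-EXCEEDED\]", value)

print(limit_exceeded_indexes(DEBUG_LDIF))
```

An empty list would mean that every index consulted for the filter still holds its ID lists.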

Count the number of entries matching the filter ‘(sn=abe)’:

$ ldapsearch -h localhost -p 1389 -b 'ou=people,dc=example,dc=com' -s one -D cn=rootdn -j ~/.pwdFile --countEntries '(sn=abe)' 1.1 | tail
dn: uid=user.1987,ou=People,dc=example,dc=com

dn: uid=user.1997,ou=People,dc=example,dc=com

dn: uid=user.1999,ou=People,dc=example,dc=com

dn: uid=user.1998,ou=People,dc=example,dc=com

# Total number of matching entries: 2000

Since the sn index has the index-entry-limit property set to 100, and 2000 entries have an sn attribute with the value “abe”, the limit is exceeded for that value. A search that uses the filter “(sn=abe)” will be an unindexed search, adversely impacting directory server performance as well as other clients of the directory server. Since the LIMIT-EXCEEDED condition exists for “abe”, the search to count the number of entries that match “(sn=abe)” will be unindexed, and will have to be executed by an authorization state that has the unindexed-search privilege. This search may take a long time, depending on the number of entries that match the search parameters.

Unindexed searches can be disallowed by the directory server.

List the databases, looking for the sn.equality index database:

$ dbtest list-all | perl -lane 'print if /sn.equality/'
sn.equality                  Index          dc_example_dc_com_sn.equality                  12002        TRUSTED

Dump Database Container

Dump the sn.equality database container:

$ dbtest dump-database-container --backendId userRoot --baseDn 'dc=example,dc=com' --databaseName sn.equality | grep -B 1 LIMIT-EXCEEDED
Indexed Value (3 bytes): abe
Entry ID List (0 bytes): [LIMIT-EXCEEDED]
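
For a large index it can be handy to collect every affected value rather than reading the grep output by eye. The Python sketch below pairs each “Indexed Value” line with the “Entry ID List” line that follows it; it assumes the two-line record format shown above, and the function name values_over_limit is made up for this example.

```python
import re

# Sample lines copied from the dbtest dump-database-container output above.
DUMP_OUTPUT = """\
Indexed Value (3 bytes): abe
Entry ID List (0 bytes): [LIMIT-EXCEEDED]
"""

def values_over_limit(dump_text):
    """Return the indexed values whose ID list is marked LIMIT-EXCEEDED."""
    values = []
    current_value = None
    for line in dump_text.splitlines():
        value_match = re.match(r"Indexed Value \(\d+ bytes\): (.*)$", line)
        if value_match:
            current_value = value_match.group(1)
        elif line.startswith("Entry ID List") and "[LIMIT-EXCEEDED]" in line:
            if current_value is not None:
                values.append(current_value)
    return values

print(values_over_limit(DUMP_OUTPUT))
```

Each value returned is a candidate for a count search, which then tells the administrator how high the limit would have to be.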

Knowing that there are 2000 entries that match the filter ‘(sn=abe)’, set the index-entry-limit for the sn index to 4000. This is the default value; it was lowered to 100 for this demonstration in order to create a LIMIT-EXCEEDED condition.

$ dsconfig -h localhost -p 1389 -D cn=rootdn -w password --useStartTls --trustAll --no-prompt set-local-db-index-prop --backend-name userRoot --index-name sn --set index-entry-limit:4000

One or more configuration property changes require administrative action or confirmation/notification.  Those properties include:

 * index-entry-limit:  If any index keys have already reached this limit, indexes must be rebuilt before they will be allowed to use the new limit.

The Local DB Index was modified successfully

Rebuild Index

The index-entry-limit for sn has now been set. Rebuild the index:

$ rebuild-index -h localhost -p 1389 -D cn=rootdn -j ~/.pwdFile --useStartTLS --trustAll -t 0 --baseDn dc=example,dc=com --index sn --completionNotify terry.gardner@unboundid.com --errorNotify terry.gardner@unboundid.com --task
[16:57:13]  The console logging output is also available in '/Users/terrygardner/servers/ds/1389/logs/tools/rebuild-index.log'
Rebuild Index task 2011121816571312 scheduled to start Dec 18, 2011 4:57:13 PM EST

Dump the database container after the rebuild is complete:

$ dbtest dump-database-container --backendId userRoot --baseDn 'dc=example,dc=com' --databaseName sn.equality | grep -B 1 LIMIT-EXCEEDED

After correction, no records have the LIMIT-EXCEEDED flag.

This post is also available at LDAPguru.info.

About Terry Gardner

Terry Gardner was a leading directory services architect with experience with many large scale directory services installations and messaging server installations, and was a Subject Matter Expert in the field of Directory Services and Solaris (operating system) performance. Mr. Gardner also participated in the open-source software community. Mr. Gardner passed away in December, 2013.